Solving for Best Responses in Extensive-Form Games using Reinforcement Learning Methods

Authors

Amy Greenwald

Jiacui Li

Eric Sodomka

Michael Littman

Abstract

We present a framework to solve for best responses in extensive-form games (EFGs) with imperfect information by transforming the games into Information-Set MDPs (ISMDPs), and then applying simulation-based reinforcement learning methods to the ISMDPs. We first show that, from the point of view of a single player, an EFG can be represented as an Information-Set POMDP (ISPOMDP) whose states correspond to the nodes in the EFG. This ISPOMDP can then be further represented as an ISMDP, whose states correspond to the information sets in the EFG. Because the transformations are lossless, every optimal policy in the ISMDP is a best response in the original EFG. Our approach to finding a best response in an EFG, therefore, is to first apply the aforementioned transformations, and to then use simulation to learn the ensuing ISMDP and standard techniques (e.g., dynamic programming) to solve it. There are two challenges to effectively learning the ISMDP through simulation: the ISMDP state space is exponential in the horizon, and we cannot resample actions during simulation. We prove that simulation can still be guaranteed to learn near-optimal best responses with high probability, although the sample complexity depends explicitly on the size of the state space. Using our best-response finding algorithm as a subroutine, we further develop two algorithms, one that implements approximate best-reply learning dynamics, and another that approximates ε-factors of strategy profiles in EFGs. We evaluated these algorithms by applying them to several sequential auction domains.
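
The pipeline described in the abstract (simulate the game, learn a model over information sets, then solve by dynamic programming) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy two-stage betting game, its payoffs and probabilities, the fixed opponent strategy, the uniform exploration policy, and the names `simulate_episode`, `learn_ismdp`, and `solve`. It does not reproduce the authors' algorithm or their sample-complexity guarantees.

```python
import random
from collections import defaultdict

# Hypothetical toy game: actions available at each of the player's information sets.
ACTIONS = {
    "root": ["bet", "fold"],
    "saw_raise": ["continue", "quit"],
    "saw_call": ["continue", "quit"],
}

def simulate_episode(rng):
    """Play one episode with uniformly random actions.

    Returns a list of (infoset, action, reward, next_infoset) steps.
    The opponent's hidden type is never revealed to the player.
    """
    strong = rng.random() < 0.25            # hidden chance move (assumed prior)
    first = rng.choice(ACTIONS["root"])
    if first == "fold":
        return [("root", "fold", 0.0, None)]
    # Fixed opponent strategy (assumed): raise more often when strong.
    raised = rng.random() < (0.9 if strong else 0.2)
    nxt = "saw_raise" if raised else "saw_call"
    steps = [("root", "bet", 0.0, nxt)]
    second = rng.choice(ACTIONS[nxt])
    if second == "quit":
        reward = -1.0
    else:
        reward = -4.0 if strong else 2.0    # showdown payoff
    steps.append((nxt, second, reward, None))
    return steps

def learn_ismdp(num_episodes, seed=0):
    """Estimate rewards and transitions between information sets from samples."""
    rng = random.Random(seed)
    visits = defaultdict(int)
    reward_sum = defaultdict(float)
    next_counts = defaultdict(lambda: defaultdict(int))
    for _ in range(num_episodes):
        for s, a, r, s2 in simulate_episode(rng):
            visits[(s, a)] += 1
            reward_sum[(s, a)] += r
            next_counts[(s, a)][s2] += 1
    return visits, reward_sum, next_counts

def solve(visits, reward_sum, next_counts, root="root"):
    """Backward induction on the learned model (memoized recursion)."""
    values, policy = {None: 0.0}, {}        # None marks a terminal state

    def value(s):
        if s in values:
            return values[s]
        best_a, best_q = None, float("-inf")
        for a in ACTIONS[s]:
            n = visits[(s, a)]
            if n == 0:
                continue                     # action never sampled; skip it
            q = reward_sum[(s, a)] / n + sum(
                c / n * value(s2) for s2, c in next_counts[(s, a)].items()
            )
            if q > best_q:
                best_a, best_q = a, q
        values[s], policy[s] = best_q, best_a
        return best_q

    value(root)
    return policy, values

if __name__ == "__main__":
    policy, values = solve(*learn_ismdp(20000, seed=0))
    print(policy, round(values["root"], 3))
```

Under the assumed numbers, the exact best response can be worked out by hand: bet at the root, quit after a raise, and continue after a call, for an expected payoff of 0.725. With enough simulated episodes, the policy learned by the sketch should agree with that calculation.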

Related resources

Solving for Best Responses and Equilibria in Extensive-Form Games with Reinforcement Learning Methods

We present a framework to solve for best responses and equilibria in an extensive-form game (EFG) of imperfect information by transforming the game into a set of Markov decision processes (MDPs), and then applying simulation-based reinforcement learning to those MDPs. More specifically, we first transform a turn-taking partially observable Markov game (TT-POMG) into a set (one per player) of pa...

An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information

Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensive-form games and (2) the algorithmic framework of doub...

A reinforcement learning process in extensive form games

The CPR (“cumulative proportional reinforcement”) learning rule stipulates that an agent chooses a move with a probability proportional to the cumulative payoff she obtained in the past with that move. Previously considered for strategies in normal form games (Laslier, Topol and Walliser, Games and Econ. Behav., 2001), the CPR rule is here adapted for actions in perfect information extensive fo...

Multiagent Reinforcement Learning in Stochastic Games

We adopt stochastic games as a general framework for dynamic noncooperative systems. This framework provides a way of describing the dynamic interactions of agents in terms of individuals' Markov decision processes. By studying this framework, we go beyond the common practice in the study of learning in games, which primarily focuses on repeated games or extensive-form games. For stochastic games...

Using games in Oncology Teaching

Introduction: Educational methods can be classified into two groups: active methods and passive ones. Applying games is an active approach in teaching. The present study aimed at investigating the effect of games on teaching oncology. Methods: Twenty-three medical students participated in the study. They took two class sessions of oncology. In the first session the basic principles and concepts...

Publication date: 2013
